PIR-ALN: a database of protein sequence alignments
Abstract
MOTIVATION: The Protein Information Resource (PIR) maintains a database of annotated and curated alignments in order to visually represent interrelationships among sequences in the PIR-International Protein Sequence Database, to propagate and standardize protein names, features and keywords among members of a family or superfamily, and to aid in classifying sequences, identifying conserved regions and defining new homology domains.

RESULTS: Release 22.0 (December 1998) of the PIR-ALN database contains a total of 3806 alignments, including 1303 superfamily, 2131 family and 372 homology domain alignments. The dataset is suitable for developing and extracting patterns, testing profiles, training neural networks or building Hidden Markov Models (HMMs). The alignments can be used to standardize and propagate annotation to newer members by homology, as well as to understand the modular architecture of multidomain proteins. PIR-ALN includes 529 alignments that can be used to develop patterns not represented in the PROSITE, Blocks, PRINTS and Pfam databases. The ATLAS information retrieval system can be used to browse and query the PIR-ALN alignments.

AVAILABILITY: PIR-ALN is currently distributed as a single ASCII text file along with the title, member, species, superfamily and keyword indexes. The quarterly and weekly updates can be accessed via the WWW at pir.georgetown.edu. The quarterly updates can also be obtained by anonymous FTP from the PIR FTP site at NBRF.Georgetown.edu, directory [ANONYMOUS.PIR.ALIGNMENT].
Related resources
Database of protein sequence alignments: PIR-ALN
The Protein Information Resource (PIR) has been maintaining a database of curated protein sequence alignments since 1991. The collection includes superfamily, family and homology domain alignments. CLUSTAL V/W is used to generate multiple sequence alignments and ALNED, an interactive alignment editor, is used to check and correct them. The database has helped in classifying sequences, in defini...
ProClass protein family database
ProClass is a protein family database that organizes non-redundant sequence entries into families defined collectively by PROSITE patterns and PIR superfamilies. By combining global similarities and functional motifs into a single classification scheme, ProClass helps to reveal domain and family relationships and classify multi-domain proteins. The database currently consists of more than 120 0...
ProClass protein family database: new version with motif alignments
ProClass is a protein family database which organizes non-redundant sequence entries into families defined collectively by the ProSite patterns and PIR superfamilies. The database consists of about 100,000 entries, more than half of which are classified in about 3,000 families. The new version includes links to various protein family/domain and structural class databases and contains gapped mot...
The Protein Information Resource (PIR)
The Protein Information Resource (PIR) produces the largest, most comprehensive, annotated protein sequence database in the public domain, the PIR-International Protein Sequence Database, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Sequence Database (JIPID). The expanded PIR WWW site allows sequence similarity and text sea...
PIRSF: family classification system at the Protein Information Resource
The Protein Information Resource (PIR) is an integrated public resource of protein informatics. To facilitate the sensible propagation and standardization of protein annotation and the systematic detection of annotation errors, PIR has extended its superfamily concept and developed the SuperFamily (PIRSF) classification system. Based on the evolutionary relationships of whole proteins, this clas...
Journal: Bioinformatics
Volume 15, Issue 5
Published: 1999